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Abstract 



We investigate in this paper an alternative method to simulation based recursive importance 
sampling procedure to estimate the optimal change of measure for Monte Carlo simulations. 
We propose an algorithm which combines (vector and functional) optimal quantization with 
Newton-Raphson zero search procedure. Our approach can be seen as a robust and automatic 
deterministic counterpart of recursive importance sampling by means of stochastic approxi- 
mation algorithm which, in practice, may require tuning and a good knowledge of the payoff 
function in practice. Moreover, unlike recursive importance sampling procedures, the proposed 
methodology does not rely on simulations so it is quite generic and can come along on the top 
of Monte Carlo simulations. 

Wc first emphasize on the consistency of quantization for designing an importance sampling 
algorithm for both multi-dimensional distributions and diffusion processes. Wc show that the 
induced error on the optimal change of measure is controlled by the mean quantization error. 

Wc illustrate the effectiveness of our algorithm by pricing several options in a multi-dimensional 
and infinite dimensional framework. 

Keywords: monte carlo simulation, importance sampling, stochastic approximation, vector 
quantization, functional quantification. 

Introduction 

In this paper, we are interested in one of the most basic problem of numerical probability which 
consists in the computation of the expectation 



where X : A, P) — )• {E, \ .\e) is a random vector taking values in a Banach space E and :£"—)• M 
is a Borel function such that E[F(X)^] < -|-oo. When the space E is M*^ (equipped with the 
Euclidean norm) we will refer to the finite (multi) dimensional setting and when the space E is 
C([0, r],]R'^) (equipped with the supremum norm) to deal with the case where X is a continuous 
path-dependent diffusion process, we will refer to the infinite dimensional setting. For instance 
in mathematical finance, computing the price of an option and the sensitivities of this price with 
respect to some parameters amounts to estimate such a quantity. When no closed or semi-closed 
formulas are available, one often relies on Monte Carlo simulation which remains the most widely 
used numerical method in this context. 
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Variance reduction methods are often used to increase the accuracy of a Monte Carlo simula- 
tion or reduce its computation time. The most common variance reduction methods are antithetic 
variables, conditioning, control variate, importance sampling and stratified sampling. Adaptive 
variance reduction methods have been recently investigated to take advantage of the random sam- 
ples used to compute the expectation above in order to optimize at the same time the variance 
reduction tool (see [11] and the references therein). In practice, it is not clear that this adaptive 
one step Monte Carlo procedure is better than the basic two step procedure: optimizing the vari- 
ance reduction tool (e.g. the optimal change of measure in an importance sampling framework) 
using a small number of samples, then computing the expectation of interest with this optimized 
parameter. 

In this paper, we are interested in variance reduction by importance sampling (IS). We denote by 
p the probability density function of X and we consider a parameterized family of density functions 



{Pd)e£Q- we will choose G = R in the finite dimensional setting and = L {[0,T],M.'^), where 
q is the dimension of the Brownian motion driving the dynamic of the d-dimensional process X, 
when dealing with the infinite dimensional setting. This general framework is the one investigated 
in [15j . In this introduction we will present the importance sampling paradigm in the finite multi- 
dimensional case, i.e. we set E = W^. We suppose that p > 0, — a.e., where A^^ denotes the 
Lebesgue measure on {M.'^,Bor{W^)) and we set po = p. Moreover, we focus throughout all the 
paper on importance sampling by mean translation, i.e. we will consider that for all 9 G W^, for 
all X G M"', pe{x) = p{x — 0). The basic idea of IS is to introduce the parameter 6 in the above 
expectation ([1]) using the invariance by translation of the Lebesgue measure, for every 9 



E[F(X)] = E 



F{X + 9) 



p{X + 9) 



p{X) 



(2) 



and among all these random variables with the same expectation, we want to select the one with 
the lowest variance, i.e. the one with the lowest quadratic norm 

y{x+9)- 



Q{9) := K 



F\X + 9)- 



p^X) 



A reverse change of variable shows that: 



Q{9) = E 



F'^iX) 



p{X-9) 



Now if the density function p oi X satisfies 

(i) p is log-concave and (ii) 

and 



\x\ 



lim p{x) = 



(3) 



(4) 



Q{9) < +00, G R'^ (5) 

then, one shows that the function Q is finite, convex and goes to infinity at infinity, thus arg mine Q = 
{9 €R'^ \ DQ{9) = 0}, where DQ is the gradient of Q, is non empty (for a proof, we refer to [15]). 
Now, if DQ admits a representation as an expectation, then it is possible to devise a recursive 
Robbins-Monro (RM) procedure to approximate the optimal parameter 9* minimizing Q, namely 



Xr,), n > 1 



(6) 



where {Xn)n>i is an i.i.d. sequence of random vectors having the distribution of X, (7n)n>i is a 
positive deterministic sequence satisfying. 



7n = +00, and ^ 7n < 



n>l 



n>l 



2 



and K is naturally defined by the formal differentiation of Q: for every x E W^, 



K{e,x) = F\x)^^Dp{x - 9). (7) 

IS by means of stochastic approximation has been investigated by several authors (see e.g. [8], [5] 
and [6]) in order to estimate the optimal change of measure by a RM procedure. It has recently 
been studied in the Gaussian framework in [1] (see also [2]) where ([7]) is used to design a stochastic 
gradient algorithm. However, the regular RM procedure ([7]) suffers from an instability issue coming 
from the fact that the classical sub-linear growth Assumption in quadratic mean in the Robbins- 
Monro Theorem 

v0gm'=', ¥.[K{e,xfY <c {i + \e\) (8) 

is only fulfilled when F is constant, due to the behaviour of the annoying term p{x)/p{x — 6) as 9 
goes to infinity. Consequently, On can escape at infinity at almost every implementation as pointed 
out in [1]. To circumvent this problem, a "projected version" of the procedure based on repeated 
reinitializations when the algorithms exits from an increasing sequence of compact sets (while the 
step 7n keeps going to 0) was used. This approach is known as the projection " la Chen" (see 
for instance [3], [4] and [H]). A central limit theorem for this version of the recursive importance 
sampling algorithm is proved in |13| . This kind of technique forces the stability of the algorithm 
and prevents explosion. From a numerical point of view, this projected algorithm is known to 
converge if the sequence of compact sets have been specified suitably, which is not an easy task in 
practice. 

In [15], the authors propose a third change of variable to plug back the parameter 6 into F 



DQ{e) = E[K{e,x)] = E 



2.^ m P\X-9) Dp{X-29) 



F\X - 9) 



p{X)p{X-9) p{X-29) 



(9) 



which has a known controlled growth rate at infinity in common applications. For instance, if we 
suppose that there exists A > 0, such that 

VxGM'^, \F{x)\ <Ce^l^l, 

we can define a new function H by setting 

H(e,x) := e-2^l^l F\x - g^^^i^^_^£Pi^^3^ (10) 
' p{x)p{x — 9) p{x — 26) 

Under additional assumptions on the density function p, they derive a new stochastic approximation 
algorithm using H which satisfies the sub-linear growth condition ([8]) so that it a.s. converges 
and is stable without having to apply projection technique. They also extend this construction 
to exponential change of measure (Esscher transform) and to diffusion process using Girsanov 
transform. However, from a numerical point of view, the tuning of the algorithm needs a good 
knowledge of the behavior of F at infinity. In practical implementations, recursive importance 
sampling methods using stochastic implies specific tuning of the step sequence or the sequence of 
compact sets which in both cases strongly depends of the payoff function F. 

In order to get rid of this problem, [12j proposes an optimization Newton's algorithm to estimate 
the optimal change of measure in a Gaussian framework. They approximate DQ{9) and D'^Q{9) 

(based on the representation ([7]) with p{x) = -j^e ~ ^ x G M ) using Monte Carlo simulation 
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with n samples 

1 " 2 

n ^-^ 

k=i 

1 " 2 

k=l 

where (^fc)i<fc<n is an i.i.d. sequence of d-dimensional standard normal vectors. The (unique) 
minimum of Q is approximated by the (unique) zero On of Qn which can be computed using a 
deterministic Newton-Raphson algorithm. Moreover, several asymptotic properties are addressed. 

Deterministic optimization using a large deviation argument has been investigated in [7]. The 
optimal change of measure is selected as a local maximum of ^ ^ \og\F{e)\ - ^. This can be 
achieved only under some regularity assumptions on the function F. Moreover, from a theoretical 
point of view this choice is not optimal. 

In this work, we propose and study an alternative deterministic procedure which is not adaptive 
and does not need specific payoff-dependent tuning. The optimal change of measure is estimated 
using a robust and automatic Newton-Raphson's algorithm combined with optimal (vector and 
functional) quantization. The main advantage of our procedure is that it can be used as a generic 
variance reduction method which comes upstream the Monte Carlo simulation framework. More- 
over, we will focus on one very common situation in mathematical finance, that is, Monte Carlo 
simulation that are based on multi-factor Brownian diffusions. Indeed, the presented method easily 
extends to this framework using quadratic optimal functional quantization of stochastic processes 
(see e.g. [I9]). It is particularly adapted for financial institutions since the methodology we propose 
can come along on the top of Monte Carlo simulations. Numerical tests ilustrate the effectiveness 
of our approach in both multi-dimensional and infinite dimensional frameworks. 

The paper is organized as follows. Section 1 presents several results about vector and functional 
quantization that are required in the following. We focus on the functional quantization of diffusion 
processes. Section 2 presents the quantization based recursive importance sampling algorithm. The 
emphasis is on the consistency of quantization for designing an importance sampling algorithm 
for both multi-dimensional distributions and diffusion processes. We show in particular that the 
induced error on the optimal change of measure is controlled by the mean quantization error. In 
section 3, we provide numerical experiments of our approach by considering option pricing problems 
arising in mathematical finance. 

1 Some results on optimal quantization 

Before, dealing with the construction of the quantization based IS algorithm, we provide with some 
background on quantization of Hilbert spaces and Gaussian processes viewed as L|i-valued random 
vectors. 

1.1 Introduction to quantization of random variables 

Let G N*. The principle of the A^-quantization of a random variable X taking its values in 
a separable Hilbert space E is to study the best ||.||p-approximation of X by i?- valued random 
vectors taking at most A'^ values. The norm ||.||p is the usual norm on L^(r2,P) defined by ll-'^Hp = 

(E [\X\%]f'''. When p = 2, we talk about quadratic optimal quantization. If ii^ = M , one speaks 

about vector quantization. When E is an infinite dimensional Hilbert space like L^([0, T],di) 

1 

endowed with the usual norm 1/1^; = ( f{t)'^dt\ , we talk about functional quantization. 
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Definition 1.1 (Voronoi tessellation). Let > 1 and x := (xi, • • • ,xj\[) G E'^ be a A^-tuple re- 
ferred to us by a A^— quantizer and let Proj^ : E — t- {xi, ■ ■ ■ ,xn} be a projection following the clos- 
est neighbour rule. Then, the Borel partition C = {Ci, • • • , C^} of E defined by Cj :=Proj~^(3;j), 
i = 1, - ■ ■ ,N and satisfying 



Fioi^^iixi}) C l^y e E, \xi 



y\E 



mm \Xn 



y\E 



l<i< N. 



is called a Voronoi tessellation of E induced by x. 

One defines the Voronoi quantization of X induced by x by 

N 

X- = Proj,(X) = ^xa{X6C.}- 
1=1 

The discrete random variable X^ (one will sometimes simply write X if there is no ambiguity) 
is the best L'P(P)-approximation of X among all measurable random variable taking values in 
X := {xi, ■ ■ ■ ,xn}. In fact, for any random variable y : Q — )• {xi, ■ ■ ■ ,xn}, we have 



X-X 



<\X -Y\ 



a.s. 



so that 



X-X 



< \\X - Y\ 



X-X 



For any fixed A^-quantizer x := {xi,--- ,xn), we associate the LP(P)-mean error 

induced by x. One aims at finding a A^— tuple x G E^ which minimizes the L^-mean error over 
E^ . It amounts to minimizing the function 



Qn ■ (^ij ■ ■ ■ 











X-X 








p 



mill I X 

l<i<N 



Xi\E 



The function Q-^ reaches its minimum at one (at least) tuple x* called an optimal A^-qiiantizer. 
This infimum is in general not unique, except in some cases, in particular when d = 1 and the 
density of X is log-concave. Note that card(3;*)= A^ if card(supp(Fx))^ Moreover, the L^- 
mean quantization error p '■= minQ^ converges toward and for non-singular R'^-valued random 
vectors, the rate of convergence of convergence of ^ is ruled by the so-called Zador Theorem (see 

m). 



Theorem 1.1. Assume that X G L 



for some 6 > 0. Let f denote the density of the absolutely 



continuous part of Fx (possibly, f = 0). Then, 

qp{d) 

where qp{d) is a strictly positive depending only on p and on the dimension d. 



lim N^/'^e^, 



\ {d+p)/dp 

fixf/'^+Pdx] 



For further theoretical results on optimal vector quantization we refer to |10] . One of the 
important issues for the Numerical Probabilities viewpoint is to compute the optimal quantizers 
and the associated weights (for the applications to Numerical Probabilities in the finite dimensional 
case, see the seminal paper jl8|). However, due to the non-uniqueness of the optimal quantizers 
in the general framework, specially when E = with d > 2, we are usually leaded to search 
for stationary quantizers, i.e. A^-quantizers x satisfying \/Q'^{x) = 0. We will see further on that 
stationary quantizers are an important class of quantizers for numerics. The commonly used result 
is the quadratic case (when p = 2) recalled below. 
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Definition 1.2 (Stationarity). Let x := {xi, ■ ■ ■ ,X]\[) G E'^ be an A^-quantizer and C := {Ci, • • • , Cjv} 
its associated Voronoi partition. The random vector X is called a stationary A^-quantization of X 
if it satisfies 

V ? / j, Xi / Xj and P (X G Ui(9a(x)) =0 (12) 
(Pj)f -negligible boundary of the Voronoi cells) and 



E 



X I X 



X. 



We have in particular, E[X] = E[X]. 



1.2 Some backgrounds on functional quantization 

A rigorous extension of optimal vector quantization to functional quantization is done in [16]. The 
vector quantization problem is transposed to random vectors in an infinite dimensional Hilbert 
space, in particular, to stochastic processes (Xt)^gjQyj viewed as random vectors with values in 
L'^ {[0,T],dt). In [20], numerical performances of quadratic functional quantization with appli- 
cations to finance is investigated. In particular, the roles played by product quantizers and the 
so-called Karhunen-Loeve (K-L) expansion of Gaussian processes are pointed out. 

In what follows, we will start by the functional quantization of the standard Brownian motion 
since everything can be made explicit for this process. Then we will show how to construct from 
optimal quadratic functional quantization of Brownian motion explicit (non- Voronoi) quantization 
of Brownian diffusions. 

Assume that the separable Hilbert space E is := L^([0, T], dt), with {f,g) = f{t)g{t)dt. 
One defines the covariance operator Cw of the Brownian motion iWt)t£[o,T]j for every / € by 

Cwif) := E [(/, W) W]=(^t^ £ fis){s A t)ds^ . 

This operator is symmetric positive and can be diagonalized in the K-L orthonormal basis of L'^ 
(en)n>i with eigenvalues (A„)„>i given by 




n > 1. 



Moreover, one may expand the paths of (VF)(g[o^T] on this basis, i.e. 

w''^Y.{W,en)en, r-a.s. (13) 

Using Fubini's Theorem and the orthonormality of the K-L basis, one obtains for i > 1, p > 1 

E [{W,e,) {W,e.p)] = {Cw{ee),ep) = A^^p 

where 6ip denotes the Kronecker symbol. Consequently, the Gaussian sequence ((M^, e^))^>^ is 
pairwise non-correlated so that these random variables are independent. Hence, (|13p can be written 

n>l 

where ^„ := {W,en) /^/Ki, n > 1, is an i.i.d. sequence of random variables with standard normal 
distribution. 
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Now the idea of product functional quantization using at most N elementary quantizers is 
to quantize these random coordinates for every n > 1, one considers an optimal A^^- 

quantization (iV„ > 1) of denoted where ^„ := Proj^,^ iCn), Xn ■= ix 



N„ 
1 ' ' 



,4:) is the 
,iVn > 1. 



unique optimal A^^^i-quantizer of the normal distribution and A'^i x • • • x Nn < N, Ni, 
For n large enough, we set Nn = 1, = (which is the optimal 1-quantization) and we define the 
product quantizer by (the finite sum) 



n>l 

The product quantizer x that produces the above Voronoi quantization W is defined by 
Xi{t) = ^ VKxi^^enit), i = ih,--- ,in,- ■ ■) ^ Yii'^,- ■ ■ 

n>l n>l 

and for every multi-index i G HnM " " " i ^n}, the associated Voronoi cell of x is 

Ciix) = llV^Ci^iXn). 

n>l 

Moreover, from the independence of the normal random variables {£,n)n>i the weights P (w = Xt 
can be computed explicitly 



{W = X^^ = ^ Ci„iXn)) . 



n>l 

For numerical purposes, one may be interested by the theoretical rate of convergence for the quan- 
tization error of the Brownian motion and the stationarity of K-L product quantizer. 

Proposition 1.2 (stationarity, see [20]). The product quantizer of the Brownian motion defined 
above is a stationary quantizer, i.e. 



K 



W I W 



W. 



Proposition 1.3 (convergence rate, see |16j). For every N > 1, there exists an optimal product 
quantizer of size at most N, denoted W of the Brownian motion defined as the solution to the 
minimization problem 



mm 



W-W 



.Nn>l, dN ■■= Ni X ■ ■ ■ X Nn < N, N > I 



(14) 



Furthermore, these optimal product quantizer induces a rate optimal sequence, i.e. 



W-W 



T 



< C- 
2 (logA^)t 



for some real constant C > 



To conclude this section, we shortly describe a constructive way to quantize scalar brownian 
diffusions (for more details see e.g. [20]). The rate is O ^(logA^)^^^ 1[]^q for the Brownian motion 
as soon as the diffusion coefficient is not too degenerate. Consider the homogeneous Brownian 
diffusion process: 

dXt = b{Xt)dt + a{Xt)dWt, Xo = xo, 
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where b and a are continuous on M with at most hnear growth (i.e. \b{x)\ + |c(x)| < C(l + \x\)) 
so that a weak solution to the equation exists. Let {x^)n>i be a sequence of rate-optimal K-L 



product quantizers of the Brownian motion. For every multi-index i G OnMi^'' 
A'^i X • • • X Nn < N, consider Xi the solution of the following integral equations 

dx^it) = (b{x^it)) - laa'{xi^[t))] dt + a{xi{t))dxi_{t), 



,Nn}, with 



(15) 



where a' is the first derivative of a. To simplify notations we consider the simpler notation i for 
the multi-index i. Now set 

N 

Xi = ^x,(t)lc,fc)(VF), N>1. (16) 

i=l 

The process Xt is a non-Voronoi quantization but it is easily computable once the above integral 
equations are solved since the weights P = Xi^ are known. However the ODE (|15|) has no 
explicit solution in general. Then, for numerical implementation purposes, we use discretization 
schemes like Runge-Kutta one to estimate these quantizers. One shows that the quantized process 
X converges toward the process X with respect to the quadratic norm and the rate of convergence 
is given by the following result. 

Proposition 1.4 (See ^7\)- Assume that b is differentiable, a is positive twice differentiable and 
that b' — b- icra" is bounded. Then 

(7 Z 



X -X 



O ((logiV)~2 



1.3 Quadrature formulae for numerical integration 

We conclude this first section on optimal (quadratic) quantization by illustrating how to use it for 
numerical integration of functions defined on the Hilbert space E. We provide some quadrature 
formulae using the above quantization errors. We refer to [20j for the proofs. The main idea is that 
we know that X is close to X in distribution and if one has a numerical access to the A^-quantizer 
X with the associated weights sequence (P {X S Ci{x)))^^-^j^ of the quantization X then for every 
Borel functional F : H ^ M, the computation of the expectation 



E 



N 

F{X)\ =^F(a;i)P(XGCi(x)) 



2=1 



is straightforward. The proposition below gives some error bounds for E [^(X)] — E[F(X)] based 
on L^-quantization error (p = 2 or 4) of X — X 

Let X be a stationary quantizer for X with X its associated Voronoi quantization and F : E 
be a Borel functional defined on E. 

(i) Inequality for convex functionals: If F is convex then 



E 



F{X) < E [F{X)] 



(ii) Lipschitz functionals: 
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— If F is Lipschitz continuous then 



E [Fix)] - E Fix) < [F]L^p 



X -X 



— Let 6 : E ^ M+ be a nonnegative convex function such that OiX) £ L^(P). If F is 
locaUy Lipschitz with at most ^-growth, i.e. — -F(y)| < [Fj^jplx — y\ iOix) + 6iy)) 

then Fix) G L'^iF) and 



E[FiX)]-E Fix) <2[F]Lip 



X -X 



\\oix)\\,. 



(iii) Differentiable functionals: If F is differentiable on E with an a-Holder differential DF (a G 
[0,1]), then 



E [Fix)] - E Fix) < [DF], 



X -X 



l+a 
2 



Other quadrature formulae can be derived based on regularity assumptions on F (for more 
details we refer to |20j). 



2 Quantized importance sampling algorithm 



2.1 The finite-dimensional setting 

In order to derive the existence of a unique minimum for the function Q, we make the following 
assumption: 



Assumption 1. 



p is strictly log-concave and lim pix) = 0. 



Moreover, the differentiation of the quadratic norm Q (with respect to 9) defined by Q is required 
further on and we need for this purpose the following assumptions on the probability density 
function p: 

Assumption 2. The density function p is twice differentiable and satisfies for some a G (0, 1] 



Dp 
P 



0(|x|") as X ^ +00. 



(ii) 3 C >0 such that Vx G R'^, ^ \D'^p\ (x) < C + j^^j , where D^p is the Hessian of 

p. 

Proposition 2.1. Suppose that Assumptions 1 and 2 are satisfied and the function F satisfies 



M 9 eW, E 



F'^iX) 



P{X) 
piX-9) 



< +00 and V M > 0, E 



F2(X)|Xre*'^l^l^ 



< +00. 



Then, the function Q defined by ([3D is finite, strictly convex, differentiable on W^, goes to infinity as 
[9[ goes to infinity. As a consequence, the function Q admits a unique global minimum 9* satisfying 



ArgminQ = G M"' | DQi9) = o} = {9*} 



and the gradient is given by 



DQi9) = E 



(17) 
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Moreover, if F satisfies 
and, 



V M > 0, E F2(X)|X|2°e^l^l 
t/ien, Q is twice differentiahle and its hessian is given by 



V M > 0, sup E 
ie|<A/ 



D^Q{9) = E 



F2(X) 



p{X 



< +00 



< +00, for some p > I, 



p{X) ( Dp{X - e)Dp{X - OY D'^p{X 



p2(X - e) 



p{X - 9) 



(18) 
(19) 

(20) 



Proof. For every x G W^, 9 i— )• \ogp{x — 9) is strictly concave so that — logp(x — .) = log( ^^J_ ^ ) is 

strictly convex, hence the function 9 i— t- e^°^^p(^-*'^ = ^^^^^^ is strictly convex. Combining Fatou's 
lemma and Assumption 1, one easily obtains that lim|5i|^_|_oo Q{9) = +oo. 

In order to get the formal differentiation representation (jl7p we have to check the domination 
property for 9 G B{0,R), for every R > 0. The log-concavity of p implies that for every a; G M'^ 
and 9 G B{0,R), 

{Dp{x-9),9) 



logp{x) < logp(rE — 9) + 

Hence using Assumption 2 (i) yields 

p{x 



< 



p{x — 9) 



< e 



p{x — 9) 



(21) 



- ^) ^ CrF\X){\X\^ + l)e^«l^l" G lH 

p [X — 9) 



so that 



To justify the formal differentiation of DQ to get (j20p we proceed as follows. Let x G 
Assumption 2 yields for every 9 G B{0,R) 

p{x) (\Dp{x-9)Dp{x-9Y[ 



Using 



and, 



p{x — 9) 
p{x) 



p^{x-9) 



1 2a 



+ 1) 



p{x — 9) \p{x — 
Consequently, V 6' G i?(0, i?), we have 



D'^pl {x-9)] <Cr{ e^«l^l"|x|2" + 



oCr\x- 



|1-Q 



F\x)- 



p{x) [ Dp{x - 9)Dp{x - 9y D^p{x - 9) 



<CR{f{x)+ge{x)) 



' p{x — 9) \ p'^{x — 9) p{x — 9) 

where /(x) = F^(x)e'"^l^l" + l) and ge{x) = F'^{x)j^—^prr^e'-'^^^~^^". Using the assumption 

p8]) imphes that f{X) G ^^(P). Moreover, using another change of variable and (j2ip yields 



E [gliX)] = E 



F^PiX + 9)e 



Cp\X\- 



1 p{X + g) 
p{X) 



< E 



F2p(x + ^)e^l^l^ 



so that (|19p implies for every R > 



sup E [c/^(X)] < +00. 

Consequently, the family {ge{X))g^B(o,R) is F-uniformly integrable. This provides the expected 
representation ([20]) . □ 
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Examples of distributions 



> The Normal distribution 



p{x) = (27r)-ie-l"l'/2, 



X G 



Assumption 1 is clearly satisfied. Moreover, we have = —x and ^^^^^^ = xx"^ 

where Id is the identity matrix of size d, so that Assumption 2 is satisfied with q = 1. A 
carefull reading of the proof of Proposition 2.1 shows that assumption (|19p is useless and one 
only needs p8|) . Moreover, we have 



DQ{e) = E 
D'^Q{e) = E 



2 



> T/ie logistic distribution 



p{x) 



\2 ' 



(e^ + 1)^ 

Assumption 1 is satisfied. Assumption 2 holds with a = 1. 
[> The hyper- exponential distributions 

p{x) = Cd,a,ae-^P{x), X G M, a G [1, 2] 

where P is a positive polynomial function. 

The main idea is that we know that Q, DQ and D^Q can be approximated by 



Qn{0) :-- 
DQn{0) :-- 



E 



E 



E 



p{X - 9) 



fHx) Dp{x-e) 



p^{x-e) 
pjx) 

p{X - 9) 



^Dp{X - 9)Dp{X 
' p^{X-9) 



D^p{X 



p{X-9) 



(22) 
(23) 
(24) 



using an A^-quantizer x with the associated weights sequence (P (X G Ci{x)))i^^^^ of the quan- 
tization X. The computations of (j22p . (I23p and (|24p are straightforward. For A'^ large enough, 
F{xi) / for some i G {I,-- - ,A^}, so that Qat is also strictly convex and goes to infinity as 
1^1 goes to infinity. Moreover, it is clear that Qat is differentiable. Hence, there exists a unique 
9^ G such that ArgminQAr = G M*^ | DQn{9) = o| = {^^}- Moreover, for N large enough, 

the Hessian matrix D'^Q]^{6) is symmetric positive definite for every 9 G M*^. 

The following proposition describes the asymptotic behavior of 9^ as N ^ +oo. It shows, as 
expected, that 9^ — t- 0* as — )• +oo. First we need the following assumption: 

Assumption 3. The function F is positive, convex on M"' and satisfies 

(i) F is Lipschitz on M'^, 
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(ii) 3 a>l, such that E [F^^iX)] < +00 and for all M > 0, E + e^l^l 



< +00. 



Proposition 2.2 (Convergence of (^^)Af>i)' Consider an L"^ -optimal stationary quantizer x of 
size N with its associated quantization X. Assume that the assumptions of Proposition 2.1 and 
that Assumption 3 are satisfied. Then, we have 

e* and Qn(0^) Q{0*) as N ^ +00, 

where 6^ is the unique global minimum of Qn defined by (j22p . 

Proof. The first step of the proof consists in showing that the function Q]\r converges locally 
uniformly to the continuous function Q. Let x E M'^ and x' S M'^. We have 



p{x — 0) 



F\x') 



p{x' - 9) 



[F\x)-F\x')) 



p{x) 



p{x — 9) 



+ F\x') 



p{x) 



p{x') 



< 



[F]Lip\x-x'\ (|F(x)| + |F(x')|) 



p{x 



p{x — 9) 



+ F\x'' 



p{x — 9) p{x' — 9) 
p{x') 



p{x) 



p{x — 9) p{x' 



Using the log-concavity of p, Assumption 2 and the inequality \e^ — e^\ < \u — v\ {e^ + e") yields, 
for every 9 e B{0,R) 



p{x) p{x' — 9) 



p{x — 9) p{x' — 9) 



< 



< 



log 
log 



p{x) 
p{x'] 
p[x 



p{x') 



log 

+ 



log 



p{x — 9) 
p{x' - 9) 
p{x — 9) 



p{x) ^ p{x — 9) 



p{x') p{x' 



p{x' - 9) 



p{x) p{x - 
p[x') p{x' 



< Cr |x - x'l (1 + |x|° + |x'|") (e^l^l + e^l^'l 
<Cr\x- x'l (1 + |x| + |x'|) (e^l^l + e^l^'l 



Consequently, we have 



2r^\ P(^) Z72/^/N Pi^') 



F\x) 



p{x 



p{x' — 9) 



< C\x - x'l ((|F(x)| + |F(x')|) e^l^l'' 



+f2(x')(1 + |x| + |x'|) e^l^l + 



.c\x\ , n\x'\ 



hence, Schwarz's and Holder's inequalities implies for every 9 € 5(0, i?) 



Qi9) - Qn{9) 



< E 

< C 



p{X-9) 



p{X - 9) 



< C 



X -X 



X -X 



[F{X) + F{X)) e^l^l" 



+ 



F\X){l + \X\ + \X\] e^l-^l+e 



.c\x\ , n\x\ 



\\F{X)\\,^+ F{X) 



2a 



+ 



2a 



F\X) ^ [l+\\X\\,a + 



X 


( 




+ 












A CI 
a-1 
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Now F is convex since F is and u i— )■ u is increasing and convex on M^. Consequently, by 
Jensen's inequality 



E 



E 



p2a /jg 



X \x 



< E 



E 



F^^-iX) \x]] = E [F'^'^iX]] . 



Using similar arguments, we have: 



< \\F^{X)\ 



2a 



2a' 



X 



< \\x\ 



aC\X\\ 



4-2- 

a — 1 



4_S_, 

a — 1 



,c\x\ 



< 



a . Finally, we obtain for some positive constant C independent of 9 and 



— ^ 0, as iV — > +00. 



Q{9)-Qn{9) <C X-X 

So that Qtv converges locally uniformly to Q. 

Now let e > 0. Q being continuous at 9* and strictly convex 

r?:= inf Q{9) - Q{9*) > 0. 
e:\e~e*\>e 

The local uniform convergence of Qn to Q ensures that 

3 Nr„ ViV > Nr„ y9 G m.'^ such that \9 - 9*\ < e, then 



Qn{9)-Q{9) <7]/3. 



Assume that 9 satisfies \9 — 9*\ > e. The convexity of Qn implies that 



Qn[9*+ e- 



< 



-,Qn{9) + 1 



Qn{9*) 



so that, 



Qn{9) - Qn{9*) > 



> 



Qn + 



-Qn (9*) 
Q (9*) - 2r]/3 



>rj/3. 



Since 9 is the unique global minimum of Qn, we have Qn{9 ) — Qn{9*) < 0. Consequently, 
\^qN _ ^ g fQj. jY > and {9^)j<i>i converges to 9* . Combining the local uniform convergence 
of (QAr)Ar>i to Q and the continuity of Q at 9* , we obtain that Q 
This concludes the proof. 



N[(^" ) Q{9*), as N ^ +00. 

□ 



A classical method for estimating 9^ , i.e. for solving the system of nonlinear equations DQn{9) 
is Newton-Raphson's algorithm: 



9n+i = 9n-D^QNi9n)-^DQN{9n), u > 0, given. 



(25) 



Newton-Raphson's algorithm is attractive because it converges rapidly from any sufficiently good 
initial guess under standard assumptions. Indeed, since 9^ is the unique solution of DQn{9) = 0, 
DQn is continuously differentiable on and D'^Qn{9) is a symmetric positive-definite matrix 
for all 9 € W^, then the sequence {9n)n>o defined by (j25p is known to converge toward 9^ if is 
sufficiently close to 9^ . 

At this stage, it is natural to characterize the rate of convergence of {9^)n>i to 9*. To this end, 



first we need to obtain some error bounds for 



DQ{9) - DQn{9) , 9 £ B{0, R), for some R>0. 
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Assumption 4. The function — is a-Holder. 



Proposition 2.3. Assume that Assumptions 1, 2, 3 and 4 hold. Let x be an L"^ -optimal stationary 
quantizer of size N and X the associated Voronoi quantization. Then, for every R > 0, for every 
9 e B{0,R) 



O 



X -X 



DQ{6)-DQn{6) 
Hence, DQn converges locally uniformly to DQ. 
Proof. Let x G M.'^, x' G R'^ and Q G B(0, R). We have 



, N +CX). 



F'^(x) 



p{x)Dp{x - 9) _ 2. ,^p{x')Dp{x' 
p^ix-9) 



p(x) 



p'^{x' — 9) 



[F\x)-F\x')) -^^^Mx-9) 



+ F\x') 



p^{x-ey 

p{x) p{x') \ Dp{x — 9) 



^p{x — 9) p{x' — 9) J p{x — 9) 
, rf. Pj^') ( Dp{x-9) _ Dp{x'-e) \ 
' p{x' - 9) \ p{x - 9) p{x'-e) )' 



First we take care of the first term of the above sum. Using Assumption 2 (i), Assumption 3 (i) 
and (HH), yield 



{F'^{x)-F'^{x')) 



p{x) 



p'^{x — 9) 



Dp{x - 9) 



<Cr\x- x'\ {F{x) + F(x')) (1 + |x|")e^l^l. 



Now, we focus on the second term. Using similar arguments than the ones used in the proof of 
Proposition 2.2 and Assumption 2 (i) yield 



F^(x') 



p{x) p{x') 



p{x — 9) p{x' 



\Dp{x 



p{x — 9) 



< Cr\x - x'\F^{x') (1 + \x\ + \x'\) (e^l^l + e^l^'l 



Finally, using (|2ip and Assumption 4 for the last term implies 



J^ V{x') 



p{x' - 9) 



Dp{x - 9) Dp{x' 



p{x — 9) p{x' — 9) 



< Cr\x - x'|"F2(x')e^l^'l 



Using similar arguments than the ones used in the proof of Proposition 2.2 yields 



DQ{9) - DQn{9) 



< E 
<Cr 



^^^^^ p{X)Dp{X-9) _ ^^^^^p{X)Dp{X - 9) 



p2(A 



p2(x - 9) 



X-X 
X -X 



+ 



X-X 



as N ^ +00. 



□ 



Now we are in position to characterize the convergence rate of {9^)]^>i toward 9*. The following 
result shows that as expected the error \9^ — 9*\ is controlled by the quantization error. 

Theorem 2.4. Assume that Assumptions 1, 2, 3 and 4 hold. For every N > 1, there exists rf^ > 
such that, if \9q — 9^\ < rj^ , then the sequence {9n)n>o defined by (p5|) converges to 9^. Moreover, 
the convergence rate of {9^)j\i>i to 9* is based on L"^ -quantization error in the sense that 



9*\ = O 



X-X 



as — 7- +00. 
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Proof. We define the norm \9\^ := \D'^Q{9*)9\, for 9 e R'^. Since D'^Q{9*) is non-singular, 

-1^1 < 1^1* < k\9\, for 9 G R'^, 

where k := max {\D'^Q{9*)\;\D'^Q{9*y^\) . Choose (5 sufficiently small that 2k5 {1 + k5) < ^. We 
can choose e > sufficiently small such that if |0 — 0*| < e then 



\d'^Q{9) - D'^Q{9*)\ < 6, 

\D'^Q"\0) - D'^Q^\9*)\ < 5, 

\DQ{9) - DQ{9*) - D^Q{9*){9 -9*)\<6\9 



(26) 
(27) 
(28) 



The continuity of D^Q at 9*, the non-singularity of D^Q{9*) and the continuous differentiability 
of DQ at 9* ensures the existence of such an e. The existence of an rjN such that the sequence 
(^n.)n>o converges to 9^ if |^o — < follows using the same arguments applied to Qn- 

The convergence of {9^)j\f>i to 9* ensures the existence of A^o £ such that if A'" > Nq, 
\qN _ < Hence ([26]), ([27]) and ^ hold for 9 = 9^, N> Nq. 

Now, the recursive algorithm (I25p can be written 

= 9n- D^Q{9nr^DQ{9n) + rn + Sn- (29) 

with Vn G N, r„ = D2q(^„)-i (^I)Q(^„) - DQ{9n)) and s„ = (D2g(^^)-i _ D^Q(9n)-^^ DQ{9n). 
Using the following equality 



D^Q{9*) [9n-9* -D' 



DQ{9n, 



D''Q{9ny^ - D 



D'Q{9n) - D'Q{9*) {9n - 9*) + ( DQ{9n) - DQ{9*) - D'Q{9*){9n - 9* 



and taking norms in (|29p yields 



< {l + \D^Q{9*) 



D^Qid^r' - D^Q{9*y' ] ( D^Q{9n) - D^Q{9*) 



\0n - 0* 



DQ{9n) - DQ{9*) - D^Q{9*){9n -9*) + \D^Qi9*)rn\ + \D^Qi9*)sn 



(30) 

Let N > No and rjN > such that |^o — ^ Vn so that {9n)n>i converges toward 9^ . The 
continuity of DQ, DQ, D'^Q and D^Q at 9^ and the non singularity of D^Q{9^) and D^Q{9^) 
yield 



\D'Q{9*)rJ ^ D'Q{9*)D^Q{9^)-\DQ{9'') - DQ{9'')) 



5W\ 



and 



as n — 7- -|-oo. 



Letting n goes to infinity in ([30]) and using ([26]), (El]), ([28D for 6* = 6*^, > A^o, iniphes that 



+ 



D''Q{9*)D^Q{9^)-^ DQ{9'^) - DQ{9 



< 2k6 (1 + k6) 
Moreover, for A^ > A'^g, using ()27p we have 

D^Q{9*)D'^Q(9^y^^ = \d^Q{9*) (^D^Q{9^y^ - D^Q{9*y^ ) + < {1 + kS) . 
Finally, Proposition 2.3 and the choice of 5 yield 



< c 



X -X 



N>No, 



so that 



9^-9' 



O 



X -X 



as N ^ +00. 



□ 
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Remark 2.1. • When X is a d-dimensional gaussian vector, Assumptions 2 and 4 are satisfied 
with a = 1 so that if x is an L^-optimal quantizer with its associated Voronoi quantization X, 
Theorem 1.1. imphes the following error bound in Theorem 2.4 

<CN^^, as N +00. 

In practice, only a rough estimate of the optimal change of measure parameter 6* is needed. 
According to our numerical results, optimal quantization grids of size N ~ 200, 500 (depending on 
the dimension d) are enough. Concerning 0^, it can be computed with a high precision by a few 
steps (usually less than 10) of the Newton- Raphson's optimization procedure (I25p . 

• For the sake of simplicity we used the classical Lipschitz continuous assumption on F but 
other error bounds can be derived by replacing Assumption 3 (i) by other smoothness assumption. 




2.2 Quantized importance sampling for Brownian diffusions 

In this section, we extend the Newton-Raphson's algorithm to the infinite dimensional setting, i.e. 
the case of path-dependent diffusion. We will rely on the Girsanov transform to play the role of 
mean translator. To be more precise, we consider a d-dimensional Ito process X solution to the 
stochastic differential equation (S.D.E.) 



dXt = b{t, X*) dt + a{t, X*) dWt, Xo = x€ R'^, {Eb,a, 



w ) 



I oo • 



W = iWt)t^[Q^T] being a g-dimensional standard Brownian motion and where X* := {Xt/\s)se[{)^T] 
is the stopped process at time t, h : [0,T] x C([0,T],M'^) -^R'^,a: [0,T] x C{[0,T],M.'^) M{d,q) 
are measurable with respect to the canonical predictable cr-field on [0,T] x C([0, T], 
Under the following assumption 

{i) 6(.,0) and (t(.,0) are continuous, 

(ii) VtG [0,r],Vx,y gC([0,T],]R'^), \h{t,y) - h{t,x)\ + \\a{t,y) - a{t,x)\\ < C,^^\\x - y\ 

strong existence and uniqueness of solutions for (Eb^^^yy) can be proved (for more details, see |21j). 
We aim at devising a robust and automatic Newton-Raphson's algorithm based on functional 
quantization inspired from Section 2.1 for the computation of 

E[F{X)] 

where -F is a Borel functional defined on C ([0,T],M'^) such that 

F{X) G (P) and F {F^{X) > O) > 0. (31) 

In this functional framework, the invariance by translation of the Lebesgue measure ([2]) is replaced 
by Girsanov Theorem. We consider a translation process given by G L'^ q '■= L'^{[0,T],W^) which 
is slightly less general than the ones used in [15]. Indeed, they considered translation processes of 
the form G(t,X*) defined for every ^ G C ([0,T],]R'^) and 6 G L'^{[0,T],RP) by 

e{t,O-=^{t,e)0t, where if : [0,T] x C {[0,T],R^^ ^ M{q,p), 

is a prespecified bounded Borel function. In what follows, we can easily adapt to this kind of 
translation processes but for the sake of simplicity, we prefered to focus on this simple case. 
It follows from Girsanov Theorem that for every 9 G L'^ 



K[F{X)] = E 
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where X^^^ denotes the solution to (-E'fe+o-6»,o-,VK)- Among all these estimators we want to select the 
one with the lowest quadratic norm so that we want to solve the following minimization problem 



min Q{9) where Q{0) ■- 



E 



-2/(f(e.,dvy.)-|i(?i|% 



Another Girsanov Theorem yields 

Q{e) :=E F^{X)e 



l|2 

11^2 



(32) 



In view of numerical implementation of a Newton-Raphson's algorithm to estimate a minimum 
of Q, we are lead to consider a (non trivial) finite dimensional subspace E of L'^p spanned by 
an orthonormal basis (ei, • • • ,em)- Like for the finite dimensional framework, our procedure will 
be based on the representation (as an expectation) of the first differential DQ and the second 
differential D^Q of Q on E combined with functional quantization of (-Eft^o-.v^)- 

Proposition 2.5. Assume that E [F(X)^"'"'^] < +co for some 6 > as well as assumptions (T-Lb^cr) 
and (j3T]) hold. Then the function Q defined by ([32]) is finite, strictly convex on ^ and 

lim Q{e) = lim Q{e) = +oo. 

T,q ^T,q 

Moreover, the function Q is twice differentiable at every 9 € and for every tp £ Lj.^, cp £ Lj^^ 



{DQ{e),i;)i 



E 



'T.q 



'L2 
T.q 



F2(X)e*(^) (Z)$(0),V)j 
{D^Q{e)^P,cl))=E[F^{X)e''^'^ ((Z)$(0), V')^^ ^ {0^9), + (^'^(^)V^, 

where ^ : — Jq {9s, dWs) + ^ ||^|lz,2 is twice differentiable with 



(33) 
(34) 



{D^9),i;), 



'T.q 



{il;s,dWs)+f {9s,^s)ds and {D^<^>i9)ij, 
Jo 



s,ips) ds. 



Proof. Owing to Holder's inequality of conjugate exponents {s,t) := (1 + 2)1 + f), we have for 
every 9 G 

Q(^) < E F(X)2+'5 E e' = E F(X) 



2+5 



I'-l < +00 



/ Jo{es,dWs)~^\ml2 \ „ 
since the Doleans exponential I e 1 is a true martingale for any 9 G Ljn ^. 

The function Q is strictly convex since the function ^ is and exp is strictly increasing and strictly 
convex. 

Now, using the trivial equality 



-/(f(0s,dH/.> + i||9|| 



T,q 



/„^(0„diy.>+i||e|i2 



1 rT 
' 2 



T.q 



^T.q 
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and the reverse Holder's inequality with conjugate exponents (5,-5) yields 



Q{e) > E 



E 



y^(es,dWs)-l\\e\\l2 



> E 



F3{X) 



T,q 



by the martingale property of the Doleans exponential. Let e > such that P [F'^{X) > s) > 0. 

We have Q{e) > e^E [1f^x)>s] = e^F (^^(^) > e)^ e"* ""^^s so that Q goes to infinity 

as 11^11x2 goes to infinity. 



The random functional <I> from into L'^{¥) (for all r > 1) is twice differentiable since 



1 

< - 

- 2 



T,q 



and, 



T,q 



where ijj 1— t- {D^{6),ip) is a bounded linear operator from into L'"(P). Using the Cauchy- 



"T,q 



Schwarz and the Burkholder-Davis-Gundy inequalities, its operator norm satisfies || 11)0(0)1 ||j;^2 l''(¥) — 



T,q^^ 



Cr{l + ll^ll ^2 )• Consequently, by the composition rule, we derive that ^ : 9 e*^^^ is twice diffen- 



tiable from into L^{¥) for any r > 1. Moreover, for every il^,(t> ^ -^t have (D^(0), 'i/')x,2 



T,q 



e*W (L)$(0),'0)^2^^, and {DH {6)1!^ , (f) = e*W (^(L>$(0), ^)^2^ ^ (D$(0), 0)^2 ^ + (1)2^(0)^,, 

We conclude that 6 1— t- (5(0) = IE [F2(X)e*'-^''] is twice differentiable with first and second 
differentials characterized by (I33p and (I34p . □ 



The target of the stochastic algorithm investigated in [15] is the minimum of the restriction of 
Q on E, 6*^ which satisfies DQ\^{9*^) = 0. Like for the static framework, in order to approximate 

DQ\e{6) and D'^Q\e{0) (which can be seen respectively as the m-tuple ( {DQ{9),ei) ) and 
as an m X m symetric positive definite matrix D'^Q^^{9) = (^D'^Q{9)ei,ej)^^.^^-^^^^^) for every 
9 £ E, we consider an (non-Voronoi) A^- functional quantization X of (-Eft^g-.w) given by (jl6p . Hence, 
for every 9 £ E, we approximate Qe{P)^ DQe{9) and D^Qe by respectively Qn{(^)-, DQn{9) and 
D^Qn{9) defined by 



Qn{0)=E 



DQNi9),ei 



"T,q 



D^QN{9)ei,e, 



E 
E 



F2(X)e*(^) (D%{9),e, 
F2(l)e*(^) ( (D${9),e. 



^T,q. 



\ ' " -/ Ll,. \ ^ '■ Vl2 



D^{9),eA , + [D''^{9)ei,e, 



T,q 



where %{9) = - {9.,, dt?A + i 



la 



T,q 



0^9), a 



^T.q 



(eiis),dWs) + /(f {9s,ei{s)) ds 



, m. 



and \^D'^^{9)ei,ejj = {ei,ej)^2^^, i = !,■■■ ,m, j = 1, 

Hence, we compute the minimum 9^ of Qtv by devising the following Newton-Raphson algo- 



rithm 



,+i = 9n-H{9ny^J{9n), n > 0, 9oeE given. 



(35) 
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where for every 6 e E, H{9)ij = (D'^QN{9)ei,ej] and J{9)i = lDQjq{6),ei) ^ , i = 1, - ■ ■ ,m, 
j = !,■■■ ,m. 

Remark 2.2. Like for the static framework (see Proposition 2.2 and Theorem 2.4), the convergence 
of (o^^ ^ toward 9^ and its convergence rate can be estabhshed under assumptions similar to 
the finite dimensional framework. 

3 Numerical illustrations 

In the following of this section, we illustrate the performance of these generic variance reduction 
algorithms both in the finite dimensional framework and in the diffusion framework. The numerical 
simulations are done in Scilab 5.3. 

3.1 Finite dimensional setting 

Basket options: We consider basket options with payoffs given by WiS^ — where 

{wi, ■ ■ ■ ,Wd) is the vector of weights, K denotes the strike, T is the maturity and is the price 
at maturity of the ith asset. We assume that each of the d assets under the risk- neutral measure 
has a price given by a Black-Scholes model driven by the vector of independent Brownian motions 

w={w,\--- ,Wf, t>0), 

where Z = (Z^, • • • , Z"') is a Gaussian vector of size d. We price this basket option with different 
values of the number of assets d and the strike K. The quantization grids have the same size 
= 200 and the number of Monte-Carlo simulations n is 100,000 in every case. Note that for each 
value of d and each value of the strike K, the prices are computed using the same pseudo-random 
number generator initialized with the same seed. 

The numerical results are reported in Tabled! In this table, the first two columns correspond to 
the different values of the dimension d and the strike K. The third and fourth columns correspond 
to the crude Monte-Carlo estimator and its associated variance. The fifth and sixth columns refer 
to the Monte-Carlo estimator and its variance using the optimal change of measure 9^ computed 
with our Newton-Raphson's algorithm. 

We can see in this example that our Quantization based Importance Sampling algorithm does 
reduce the variance by a factor varying from 6 up to 15. Note that it does not require any Monte- 
Carlo simulations to compute the optimal change of measure and unlike most adaptive importance 
sampling algorithm it does not need specific parameter tuning. One does not have to set up 
complicated adjustments when using it, it is fully generic and automatic. Hence, it is a very 
interesting variance reduction procedure to be used in an industrial way. 

Spark spread option: We consider now an exchange option between gas and electricity (called 
spark spread) with payoff given by (5|, — h^Sj, — where and Sj. denote electricity and gas 
spot prices at maturity T, hn is a heat rate and C is the generation cost. This kind of payoff 
appears in the pricing of power plant. We assume that the dynamic of electricity and gas spot 
prices follows the SDE: 

dSi = 9j (^aj - log S{dt + ajdW^ , j = e, g, 
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Table 1: Basket option in dimension d = 2, - ■ ■ ,6 with r = 0.05, T = I, SI, = 50, a' = 0.3, Wi = 1/d, i = 1, - ■ ■ ,d, 
N = 200 n = 100, 000 



d 


K 


Price MC 


Variance IVEC 


Price QIS 


Variance QIS 


Z 


OU 


0.4/0 




o.4yu 


( .OD 




f: c 
OO 


o.zy4 


4U.04 


Q QflO 

o.ouy 


A QQ 
4.00 




oU 


l.o M 


OA riQ 


l.OOO 


HQ 

z.Uo 


Q 


OU 


4. (01 


AO m 
4z.Ul 


4. /DU 


O.OO 
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60 


1.237 


12.38 


1.221 


1.01 


4 


50 


4.333 


32.03 


4.343 


4.64 




55 


2.086 


16.93 


2.089 


2.01 




60 


0.882 


7.29 


0.868 


0.58 


5 


50 


4.061 


26.05 


4.057 


3.99 




55 


1.781 


12.77 


1.777 


1.51 




60 


0.642 


4.63 


0.647 


0.35 


6 


50 


3.843 


22.10 


3.830 


3.50 




55 


1.566 


10.02 


1.553 


1.19 




60 


0.506 


3.25 


0.494 


0.22 



where and are two independent Brownian motions. The stochastic processes = log(S-'), 
j = g,e are Ornstein-Uhlenbeck processes: 

dXl = Ojifij — Xl)dt + ajdWj^ , where /x-,- = aj 

Writing spot prices as exponential of a sum of Ornstein-Uhlenbeck processes is a very common way 
to reproduce the mean reversion behavior of commodity spot prices. This model was first proposed 
by Schwartz in [22]. In this example, the dimension d is equal to 2. The quantization grids have 
the same size N = 200 and the number of Monte-Carlo simulations n is 100,000 in every case. 

The numerical results are summarized in Table [2] where we price the spark spread option for 
different values of C. We see that our Quantization based IS algorithm perfoms well again. In any 
case, the variance is divided by at least 13. Once again the Newton-Raphson algorithm proposed 
converges quickly, i.e. five iterations are enough to get a very accurate estimate of 6*^. 

Table 2: Spark spread option with T = 0.5, = 40 $/MWh, Sj? = 4 $/MMBTU (BTU: British Thermal Unit), 
fTe = 0.7, (Tg = 0.35, \e = Xg = 0.3, Oe = log(S'o), = log(5^), = 10 BTU/kWh, C = 0, 3, 5, 8, 10, 12 $/MWh, 
N = 200, n = 100, 000. 



c 


Price MC 


Variance MC 


Price QIS 


Variance QIS 





7.933 


221.01 


7.957 


16.48 


3 


6.681 


189.24 


6.757 


13.32 


5 


6.024 


176.93 


6.049 


11.54 


8 


5.081 


153.44 


5.083 


9.16 


10 


4.575 


141.09 


4.531 


7.81 


12 


4.057 


125.49 


4.032 


6.61 



3.2 Infinite dimensional setting 

We consider three different basis of L^([0, 1],M) 

• a polynomial basis composed of the shifted Legendre polynomials {Pn)n>o defined by 

1 d" 

Vn > 0, Vt G [0, 1], Pnit) = Pn{2t - 1) where Pn{t) = - l)) • (ShLeg) 

2"'n! at^ 
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• the Karhunen-Loeve basis defined by 



Vn > 0, Vt G [0,1], e„(t) = ^Pl 




(KL) 



• the Haar basis which is defined by 

Vn > 0, VA; = 0, ...,2" - 1, Vt G [0, 1], Vn,fc(i) = 2^{2''t - n) (Haar) 

where 

f 1 iftG [0,i) 

m = l -1 if^e[i,i) 

otherwise. 

Asian option: The considered payoff is an Asian option on a discrete time schedule of obser- 
vation dates to < • • • < ^p-i = T with payoff Z^fc=o ~ • 

Black-Scholes Model: First we consider that S follows the classical Black-Scholes model with interest 
rate r = 4% and volatility a = 50%. The strike of the option is set at K = 115, the maturity at 
T = 1, p = 100 observation dates and the initial price So = 100. For the different basis mentioned 
above and different values of m (2, 4 and 8), the results of our algorithm are summarized in Table 

SI ^ 

Note that since for all t e [0,r], St = e''r-^)t+aWt^ ^r numerics we consider the non-Voronoi 
St defined by 

2 ^ ^ 2 

St := e('^-V)*+-^^ = ^e(^-^)*+">^»«lc,(x,)(^)' ^ ^ 1' (36) 

instead of approximating the solution of the ODE given by ()15l) . 

We set the optimal product quantizer at level = 966 which corresponds to the optimal 
decomposition A^i = 23, N2 = 7, A3 = 3, A4 = 2 for the problem (jl4p (see [16] for more details). 
The number of Monte-Carlo simulations n is 100,000 in every case. Note once again that for each 
basis and each value of the dimension m, the prices and the variances are computed using the same 
pseudo-random number generator initialized with the same seed. In Figure 13.21 are depicted the 
optimal variance reducer 0^ when the minimization of Qn is carried out on Em = span(ei, • • • , em) 
for several values of m in the different basis mentioned above. 

Table 3: Asian option in the Black-Scholes model with So = 100, K = 115, T = I, p = 100, a = 50%, r = 4%, 
div = 966, n = 100, 000. 



Basis 


m 


Price MC 


Variance MC 


Price QIS 


Variance QIS 


Constant 


1 


7.112 


293.98 


7.006 


79.07 


Legendre 


2 


7.179 


296.52 


7.062 


22.46 


(ShLeg) 


4 


7.033 


290.71 


7.093 


22.17 




8 


7.180 


300.72 


7.096 


22.25 


Karhunen-Loeve 


2 


7.102 


296.91 


7.043 


83.81 


(KL) 


4 


7.066 


295.76 


7.034 


74.31 




8 


7.082 


290.51 


7.096 


37.69 


Haar 


2 


7.104 


293.13 


7.035 


33.14 


(Haar) 


4 


7.110 


297.77 


7.074 


25.07 




8 


7.136 


299.64 


7.065 


23.17 
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Figure 1: Asian option: Optimal 6 obtained by our algorithm in the case of the Black-Scholes 
model for different basis and several values of m {m = 2 for the left upper curves, m = 4 for the 
right upper curves and m = 8 for the lower curves) . 



Local- Volatility Model: Now, we consider the same payoff function in a local volatility model 
(inspired by the CEV model) defined by 

AXt = rXtdt + aXt ' ,^ AWt, Xq = x, (37) 

+ X^ 

with r = 0.04, o" = 5, x = 100 and /? = 0.5. For numerics, the solution of the ODE given by (jlSp 
is approximated by a sixth order Runge-Kutta scheme. The number of Monte-Carlo simulations n 
is 50,000 in every case. The numerical results are summarized in Table HI 

Schwartz's Model: Again we consider the same payoff function in the Schwartz model with r = 0.04, 
a = 50%, 5*0 = 100, a = log(S'o), A = 0.3. The number of Monte-Carlo simulation n is 100,000 in 
every case. The numerical results are summarized in Table [H Note that since the spot price can 
be written 

5^ ^ glog(5o)e-«*+M(l-e-«')+yt ^-^^^ = -Ytdt + adWu Yq = 0. 

Hence, to quantize the diffusion S, we just have to obtain a (rate optimal) iV-product quantizer 
of the centered Ornstein-Uhlenbeck process Y . It is given by 

Vii't) = (^^Xi^Cn^Pn{t), i G n„>i {1, • • • ,iV„} , 

n>l 
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Table 4: Asian option in the Local Volatility model with x = 100, K = 115, T = 1, p = 100, a = 5, r = 4%, 
13 = 0.5, dN = 966, n = 50,000. 



Basis 


m 


Price MC 


Variance IVIC 


Price QIS 


Variance 


Constant 


1 


6.635 


205.69 


6.681 


58.63 


Legendre 


2 


6.646 


204.84 


6.619 


16.94 


(ShLeg) 


4 


6.593 


206.52 


6.661 


16.84 




8 


6.537 


203.39 


6.627 


17.29 


Karhunen-Loeve 


2 


6.562 


203.73 


6.620 


66.55 


(KL) 


4 


6.627 


205.24 


6.578 


53.25 




8 


6.700 


207.31 


6.637 


32.61 


Haar 


2 


6.583 


204.65 


6.651 


26.06 


(Haar) 


4 


6.535 


203.09 


6.669 


18.82 




8 


6.679 


206.39 


6.656 


17.48 



QIS 



where Xn '■= [x-^ • • • ^Xj^'^j is the unique optimal A^^-quantizer of the normal distribution, c„ = 

^^^n-imf+m' ' = \/? - 1/2) sin (7r(n - 1/2)^) + 6 (cos (7r(n - 1/2)^) - e^^*)). Con- 

sequently we save computation time since we don't need to devise a Runge-Kutta scheme. 



Table 5: Asian option in Schwartz's model with So = 100, K = 115, T = 1, p = 100, a = 50%, r = 4%, a = log(S'o), 
A = 0.3, dN = 966, n = 100, 000. 



Basis 


m 


Price MC 


Variance MC 


Price QIS 


Variance QIS 


Constant 


1 


5.012 


173.59 


4.905 


40.07 


Legendre 


2 


5.029 


173.80 


4.952 


11.74 


(ShLeg) 


4 


4.980 


170.77 


4.978 


11.78 




8 


5.091 


180.02 


4.962 


12.00 


Karhunen-Loeve 


2 


4.928 


171.54 


4.960 


44.55 


(KL) 


4 


4.974 


171.92 


4.956 


39.48 




8 


4.980 


171.64 


4.961 


21.50 


Haar 


2 


4.999 


173.55 


4.944 


17.47 


(Haar) 


4 


5.027 


175.04 


4.949 


13.28 




8 


4.932 


169.83 


4.970 


12.28 



Down & In Call option: We consider an Down &: In Call option of strike K and barrier L. 
This option is activated when the underlying process X moves down and hits the barrier L. The 
payoff function at maturity T is defined by 



F{X) = (Xt - K)+l 



{mino<t<T ^t<L} 



A standard approach to price the option is to consider the continuous Euler scheme X of step 
tk = kj^ obtained by extrapolation of the Brownian Motion between two instants of discretization. 
For every t G we can write 

Xt = Xt, + h{Xt,){t - tk) + a{Xt,)iWt -WtJ, Xo = xo£ M. 
By preconditioning, 



E 



{Xt - 



E 



{Xt-ku[i- llp{Xt„Xt,^,: 



k=0 



where p{xk,Xk+i) = F {mmt^<t<tk+i > L\{Xt^,Xt,^_^^) = {xk,Xk+i)) is the probability of non 
exit of some brownian bridge. Using the law of the brownian bridge (see for example [9]), we can 
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write 



p{Xk,Xk+l) 



L — Xk 
mill Wt < — 7 — r- 

te[o,h] a{xk) 



Xk+1 ~ Xk 

a{xk) 



1 _ e ('fe+i-*fc)<^ (=:fc) if L < min(2;fc, rcfc+i), 
, otherwise. 



(38) 
(39) 



Hence, we run our algorithm with this modified payoff function: (Xt—K)^ i,^ ~ '^k^o^ Pi-^tk ' -^t^+i) 
We set the number of steps M = 100. In the following simulations, we consider the local volatility 
model ()37p and the classical Black-Scholes model. The results are summarized in Tableland Table 
[71 In Figure 2 are depicted the optimal variance reducer for the local volatility model. Our numer- 
ical results illustrate the effectiveness of our Newton-Raphson's IS algorithm. In this example, the 
computation time needed to achieve a given precision is divided by a factor 8 in comparison with 
the crude Monte Carlo estimator. 



Table 6: Down&In Call option in Local volatility model with Xo = 100, K = 115, L = 65, T = 1, cr = 5, r = 4%, 
13 = 0.5, div = 966, n = 50, OOP, M = 100. 



Basis 


m 


Price MC 


Variance MC 


Price QIS 


Variance 


Constant 


1 


0.684 


25.80 


0.673 


15.33 


Legendre 


2 


0.711 


27.92 


0.662 


4.58 


(ShLeg) 


4 


0.683 


25.66 


0.684 


3.46 




8 


0.686 


26.59 


0.685 


3.35 


Karhunen-Loeve 


2 


0.680 


25.38 


0.696 


5.22 


(KL) 


4 


0.702 


26.39 


0.683 


6.53 




8 


0.687 


26.39 


0.688 


5.80 


Haar 


2 


0.648 


24.90 


0.673 


8.15 


(Haar) 


4 


0.671 


25.15 


0.692 


5.46 




8 


0.709 


30.17 


0.700 


5.29 



QIS 



Table 7: Down&In Call option in the Black-Scholes model with 5*0 = 100, K = 115, L = 65, T = 1, cr = 50%, 
r = 4%, div = 966, n ^ 100, 000, M ^ 100. 



Basis 


m 


Price MC 


Variance MC 


Price QIS 


Variance QIS 


Constant 


1 


0.481 


21.76 


0.467 


8.62 


Legendre 


2 


0.455 


19.25 


0.469 


3.54 


(ShLeg) 


4 


0.474 


21.03 


0.477 


3.43 




8 


0.451 


19.96 


0.470 


3.39 


Karhunen-Loeve 


2 


0.466 


21.36 


0.459 


5.51 


(KL) 


4 


0.470 


22.37 


0.471 


5.78 




8 


0.462 


21.93 


0.465 


5.20 


Haar 


2 


0.471 


22.16 


0.473 


7.38 


(Haar) 


4 


0.469 


22.08 


0.464 


5.03 




8 


0.470 


21.09 


0.476 


5.26 
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